Climate Change Explorer¶

ClimateChangeExplorer is a data analysis project focusing on global climate trends from 1961 to 2022, with predictive insights up to 2050. Leveraging Jupyter Notebooks, it offers data cleaning, visualization, and predictive modeling for understanding and forecasting climate change dynamics.¶
In [1]:
import pandas as pd
import numpy as np
import matplotlib as plt
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

Data exploration¶

Visualization of my DataFrame:¶

  • ObjectId: A unique identifier for each row in the table.
  • Country: The name of the country.
  • ISO2: The 2-letter ISO code of the country.
  • ISO3: The 3-letter ISO code of the country.
  • Indicator: Describes the type of climate indicator, in this case, 'Temperature change with respect to a baseline climatology, corresponding to the period 1951-1980'.
  • Unit: The unit in which the indicator is measured, in this case, degrees Celsius.
  • Source: The source of the data.
  • CTS_Code: Code related to surface temperature change.
  • CTS_Name: Name related to surface temperature change.
  • CTS_Full_Descriptor: Complete description of surface temperature change.
  • F1961 to F2022: Columns representing temperature changes for each year from 1961 to 2022.
In [2]:
df = pd.read_csv('climate_change_indicators.csv')
In [3]:
df.head()
Out[3]:
ObjectId Country ISO2 ISO3 Indicator Unit Source CTS_Code CTS_Name CTS_Full_Descriptor ... F2013 F2014 F2015 F2016 F2017 F2018 F2019 F2020 F2021 F2022
0 1 Afghanistan, Islamic Rep. of AF AFG Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 1.281 0.456 1.093 1.555 1.540 1.544 0.910 0.498 1.327 2.012
1 2 Albania AL ALB Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 1.333 1.198 1.569 1.464 1.121 2.028 1.675 1.498 1.536 1.518
2 3 Algeria DZ DZA Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 1.192 1.690 1.121 1.757 1.512 1.210 1.115 1.926 2.330 1.688
3 4 American Samoa AS ASM Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 1.257 1.170 1.009 1.539 1.435 1.189 1.539 1.430 1.268 1.256
4 5 Andorra, Principality of AD AND Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 0.831 1.946 1.690 1.990 1.925 1.919 1.964 2.562 1.533 3.243

5 rows × 72 columns

In [4]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 225 entries, 0 to 224
Data columns (total 72 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   ObjectId             225 non-null    int64  
 1   Country              225 non-null    object 
 2   ISO2                 223 non-null    object 
 3   ISO3                 225 non-null    object 
 4   Indicator            225 non-null    object 
 5   Unit                 225 non-null    object 
 6   Source               225 non-null    object 
 7   CTS_Code             225 non-null    object 
 8   CTS_Name             225 non-null    object 
 9   CTS_Full_Descriptor  225 non-null    object 
 10  F1961                188 non-null    float64
 11  F1962                189 non-null    float64
 12  F1963                188 non-null    float64
 13  F1964                188 non-null    float64
 14  F1965                188 non-null    float64
 15  F1966                192 non-null    float64
 16  F1967                191 non-null    float64
 17  F1968                191 non-null    float64
 18  F1969                190 non-null    float64
 19  F1970                189 non-null    float64
 20  F1971                191 non-null    float64
 21  F1972                192 non-null    float64
 22  F1973                193 non-null    float64
 23  F1974                192 non-null    float64
 24  F1975                188 non-null    float64
 25  F1976                189 non-null    float64
 26  F1977                185 non-null    float64
 27  F1978                189 non-null    float64
 28  F1979                189 non-null    float64
 29  F1980                191 non-null    float64
 30  F1981                191 non-null    float64
 31  F1982                192 non-null    float64
 32  F1983                190 non-null    float64
 33  F1984                188 non-null    float64
 34  F1985                188 non-null    float64
 35  F1986                190 non-null    float64
 36  F1987                190 non-null    float64
 37  F1988                190 non-null    float64
 38  F1989                190 non-null    float64
 39  F1990                189 non-null    float64
 40  F1991                188 non-null    float64
 41  F1992                208 non-null    float64
 42  F1993                209 non-null    float64
 43  F1994                208 non-null    float64
 44  F1995                210 non-null    float64
 45  F1996                210 non-null    float64
 46  F1997                207 non-null    float64
 47  F1998                210 non-null    float64
 48  F1999                209 non-null    float64
 49  F2000                209 non-null    float64
 50  F2001                208 non-null    float64
 51  F2002                212 non-null    float64
 52  F2003                214 non-null    float64
 53  F2004                213 non-null    float64
 54  F2005                212 non-null    float64
 55  F2006                215 non-null    float64
 56  F2007                217 non-null    float64
 57  F2008                212 non-null    float64
 58  F2009                212 non-null    float64
 59  F2010                215 non-null    float64
 60  F2011                217 non-null    float64
 61  F2012                215 non-null    float64
 62  F2013                216 non-null    float64
 63  F2014                216 non-null    float64
 64  F2015                216 non-null    float64
 65  F2016                213 non-null    float64
 66  F2017                214 non-null    float64
 67  F2018                213 non-null    float64
 68  F2019                213 non-null    float64
 69  F2020                212 non-null    float64
 70  F2021                213 non-null    float64
 71  F2022                213 non-null    float64
dtypes: float64(62), int64(1), object(9)
memory usage: 126.7+ KB
In [5]:
df.describe()
Out[5]:
ObjectId F1961 F1962 F1963 F1964 F1965 F1966 F1967 F1968 F1969 ... F2013 F2014 F2015 F2016 F2017 F2018 F2019 F2020 F2021 F2022
count 225.000000 188.000000 189.000000 188.000000 188.000000 188.000000 192.000000 191.000000 191.000000 190.000000 ... 216.000000 216.000000 216.000000 213.000000 214.000000 213.000000 213.000000 212.000000 213.000000 213.000000
mean 113.000000 0.163053 -0.013476 -0.006043 -0.070059 -0.247027 0.105505 -0.110832 -0.199110 0.157942 ... 0.931199 1.114815 1.269773 1.439521 1.280785 1.302113 1.443061 1.552038 1.343531 1.382113
std 65.096083 0.405080 0.341812 0.387348 0.309305 0.270734 0.378423 0.339484 0.270131 0.308540 ... 0.321595 0.564903 0.462162 0.401091 0.393999 0.596786 0.467510 0.621930 0.484692 0.669279
min 1.000000 -0.694000 -0.908000 -1.270000 -0.877000 -1.064000 -1.801000 -1.048000 -1.634000 -0.900000 ... 0.118000 -0.092000 -0.430000 0.250000 0.017000 0.238000 0.050000 0.229000 -0.425000 -1.305000
25% 57.000000 -0.097000 -0.164000 -0.205500 -0.236500 -0.392500 -0.035750 -0.259500 -0.340000 -0.009000 ... 0.743500 0.744000 1.017750 1.147000 1.027500 0.865000 1.169000 1.161750 1.019000 0.878000
50% 113.000000 0.064500 -0.056000 -0.003000 -0.056000 -0.230500 0.098000 -0.146000 -0.187000 0.204000 ... 0.897000 0.986500 1.215000 1.446000 1.282000 1.125000 1.412000 1.477000 1.327000 1.315000
75% 169.000000 0.318500 0.114000 0.230500 0.132500 -0.091500 0.277000 0.015000 -0.067000 0.349000 ... 1.187500 1.335500 1.520500 1.714000 1.535000 1.834000 1.698000 1.826250 1.629000 1.918000
max 225.000000 1.892000 0.998000 1.202000 1.097000 0.857000 1.151000 1.134000 0.476000 0.939000 ... 1.643000 2.704000 2.613000 2.459000 2.493000 2.772000 2.689000 3.691000 2.676000 3.243000

8 rows × 63 columns

Data cleaning and transformation¶

¶

  • Dropping duplicate values (no duplicate values exist).
  • Transformation of year columns (remove 'F' from column names).
  • Handling null values (transform null values to the mean, avoid data loss).
In [6]:
df.shape
Out[6]:
(225, 72)
In [7]:
column_names= df.columns
new_column_names = {col: col[1:] for col in column_names if col.startswith('F')}
df.rename(columns=new_column_names, inplace=True)
In [8]:
df.drop_duplicates()
Out[8]:
ObjectId Country ISO2 ISO3 Indicator Unit Source CTS_Code CTS_Name CTS_Full_Descriptor ... 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022
0 1 Afghanistan, Islamic Rep. of AF AFG Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 1.281 0.456 1.093 1.555 1.540 1.544 0.910 0.498 1.327 2.012
1 2 Albania AL ALB Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 1.333 1.198 1.569 1.464 1.121 2.028 1.675 1.498 1.536 1.518
2 3 Algeria DZ DZA Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 1.192 1.690 1.121 1.757 1.512 1.210 1.115 1.926 2.330 1.688
3 4 American Samoa AS ASM Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 1.257 1.170 1.009 1.539 1.435 1.189 1.539 1.430 1.268 1.256
4 5 Andorra, Principality of AD AND Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 0.831 1.946 1.690 1.990 1.925 1.919 1.964 2.562 1.533 3.243
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
220 221 Western Sahara EH ESH Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 1.423 1.401 1.510 1.732 2.204 0.942 1.477 2.069 1.593 1.970
221 222 World NaN WLD Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 1.016 1.053 1.412 1.660 1.429 1.290 1.444 1.711 1.447 1.394
222 223 Yemen, Rep. of YE YEM Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
223 224 Zambia ZM ZMB Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 0.790 0.917 1.450 1.401 0.105 0.648 0.855 0.891 0.822 0.686
224 225 Zimbabwe ZW ZWE Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... ... 0.118 0.025 0.970 1.270 0.088 0.453 0.925 0.389 -0.125 -0.490

225 rows × 72 columns

In [9]:
df_notnull = df.dropna()
In [10]:
df_notnull.shape
Out[10]:
(156, 72)
In [11]:
df.isnull().sum()
Out[11]:
ObjectId      0
Country       0
ISO2          2
ISO3          0
Indicator     0
             ..
2018         12
2019         12
2020         13
2021         12
2022         12
Length: 72, dtype: int64
In [12]:
df=df.drop('ISO2', axis=1)
In [13]:
columns_float = df.select_dtypes(include='float64').columns
In [14]:
for column in columns_float:
    mean_column = df[column].mean()
    df[column] = df[column].fillna(mean_column)
In [15]:
df.isnull().sum()
Out[15]:
ObjectId     0
Country      0
ISO3         0
Indicator    0
Unit         0
            ..
2018         0
2019         0
2020         0
2021         0
2022         0
Length: 71, dtype: int64
In [16]:
df.shape
Out[16]:
(225, 71)
In [17]:
df.head()
Out[17]:
ObjectId Country ISO3 Indicator Unit Source CTS_Code CTS_Name CTS_Full_Descriptor 1961 ... 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022
0 1 Afghanistan, Islamic Rep. of AFG Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... -0.113 ... 1.281 0.456 1.093 1.555 1.540 1.544 0.910 0.498 1.327 2.012
1 2 Albania ALB Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... 0.627 ... 1.333 1.198 1.569 1.464 1.121 2.028 1.675 1.498 1.536 1.518
2 3 Algeria DZA Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... 0.164 ... 1.192 1.690 1.121 1.757 1.512 1.210 1.115 1.926 2.330 1.688
3 4 American Samoa ASM Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... 0.079 ... 1.257 1.170 1.009 1.539 1.435 1.189 1.539 1.430 1.268 1.256
4 5 Andorra, Principality of AND Temperature change with respect to a baseline ... Degree Celsius Food and Agriculture Organization of the Unite... ECCS Surface Temperature Change Environment, Climate Change, Climate Indicator... 0.736 ... 0.831 1.946 1.690 1.990 1.925 1.919 1.964 2.562 1.533 3.243

5 rows × 71 columns

Descriptive statistics¶

Exploratory data analysis¶

  • I work with numerical data and give them appropriate formatting.

  • I obtain a single mean for the data in order to understand the trends.

    • General Mean: The overall mean of climate change is approximately 0.5133 degrees Celsius. This indicates that, on average, there has been an increase in global temperature compared to the reference period.
    • General Mode: The overall mode is around 0.1658 degrees Celsius. This suggests that the most commonly observed temperature is around 0.1658 degrees Celsius. While it may not be the most representative temperature of the sample, it shows the central tendency of the most frequent values.
    • General Median: The overall median is 0.4255 degrees Celsius. The median is useful because it is not affected by extreme or outlier values in the data. It indicates that the midpoint of the climate change values is around 0.4255 degrees Celsius.
  • The graphs show a clear trend towards increasing temperatures in recent years.

In [18]:
pd.set_option('display.float_format', lambda x: '%.2f' % x)
df.describe()
Out[18]:
ObjectId 1961 1962 1963 1964 1965 1966 1967 1968 1969 ... 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022
count 225.00 225.00 225.00 225.00 225.00 225.00 225.00 225.00 225.00 225.00 ... 225.00 225.00 225.00 225.00 225.00 225.00 225.00 225.00 225.00 225.00
mean 113.00 0.16 -0.01 -0.01 -0.07 -0.25 0.11 -0.11 -0.20 0.16 ... 0.93 1.11 1.27 1.44 1.28 1.30 1.44 1.55 1.34 1.38
std 65.10 0.37 0.31 0.35 0.28 0.25 0.35 0.31 0.25 0.28 ... 0.32 0.55 0.45 0.39 0.38 0.58 0.45 0.60 0.47 0.65
min 1.00 -0.69 -0.91 -1.27 -0.88 -1.06 -1.80 -1.05 -1.63 -0.90 ... 0.12 -0.09 -0.43 0.25 0.02 0.24 0.05 0.23 -0.42 -1.30
25% 57.00 -0.07 -0.14 -0.17 -0.21 -0.36 -0.01 -0.25 -0.29 0.02 ... 0.75 0.76 1.03 1.16 1.04 0.88 1.18 1.19 1.03 0.89
50% 113.00 0.15 -0.02 -0.01 -0.07 -0.25 0.11 -0.11 -0.20 0.16 ... 0.92 1.01 1.23 1.44 1.28 1.15 1.44 1.50 1.34 1.35
75% 169.00 0.25 0.08 0.17 0.10 -0.11 0.24 -0.04 -0.10 0.30 ... 1.18 1.31 1.52 1.69 1.51 1.61 1.68 1.78 1.60 1.86
max 225.00 1.89 1.00 1.20 1.10 0.86 1.15 1.13 0.48 0.94 ... 1.64 2.70 2.61 2.46 2.49 2.77 2.69 3.69 2.68 3.24

8 rows × 63 columns

In [19]:
df=df.drop('ObjectId', axis=1)
In [20]:
columns_float = df.select_dtypes(include='float64')
In [21]:
columns_float_numeric = columns_float.apply(pd.to_numeric, errors='coerce')
In [22]:
mean_general = columns_float_numeric.stack().mean()
mode_general = pd.Series(columns_float_numeric.stack()).mode()[0]
median_general = pd.Series(columns_float_numeric.stack()).median()
In [23]:
print(f"Mean general: {mean_general}")
print(f"Mode general: {mode_general}")
print(f"Median general: {median_general}")
Mean general: 0.5133257312792895
Mode general: 0.1658162162162162
Median general: 0.4255
In [24]:
mean_by_year = columns_float.mean(axis=0)
In [25]:
plt.figure(figsize=(20, 6))
plt.bar(mean_by_year.index, mean_by_year.values, color='skyblue')
plt.title('Climate change per year worldwide')
plt.xlabel('Year')
plt.ylabel('Climate change average')
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--')
plt.show()
No description has been provided for this image
In [26]:
labels = columns_float_numeric.columns
In [27]:
plt.figure(figsize=(20, 6))
plt.boxplot(columns_float_numeric.values, labels=labels)
plt.title('Climate change distribution per year worldwide')
plt.xlabel('Year')
plt.ylabel('Climate change')
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--')
plt.show()
No description has been provided for this image

Measures of Dispersion¶

  • General Maximum Range: It is useful for understanding the breadth of the observed values.
  • Maximum Variance: The maximum variance is approximately 0.4239 square degrees Celsius. Variance measures how spread out the values are relative to the mean. A high variance indicates greater data dispersion around the mean.
  • Maximum Standard Deviation: The maximum standard deviation is approximately 0.6511 degrees Celsius. A high standard deviation indicates greater data dispersion relative to the mean.

"These measures of dispersion show that there is considerable variability in the climate change data over time, suggesting that climate changes are not consistent and can be quite significant at times. This highlights the importance of considering variability and not just measures of central tendency when analyzing climate data."

In [28]:
range= columns_float.max() - columns_float.min()
print(range)
1961   2.59
1962   1.91
1963   2.47
1964   1.97
1965   1.92
       ... 
2018   2.53
2019   2.64
2020   3.46
2021   3.10
2022   4.55
Length: 62, dtype: float64
In [29]:
range_general = range.max()
print(range_general)
4.548
In [30]:
columns_float.var().max()
Out[30]:
0.42393784507042254
In [31]:
columns_float.std().max()
Out[31]:
0.6511050952576108

Correlation¶

In the correlation plot, it is observed that the values are generally close to 0 or -0 before 1999 and then tend to be closer to 0 and 1 after that year. This may indicate a change in temperature trends over time.¶

  • Hypothesis of Climate Change: The shift in the distribution of temperature values after 1999 could be indicative of ongoing climate change. Values closer to 0 and 1 might reflect an increase in average temperatures and greater variability in recorded temperatures.
In [32]:
columns_year = df.filter(regex=r'^\d{4}$', axis=1)
In [33]:
df_new= df[columns_year.columns].copy()
In [34]:
df_new.head()
Out[34]:
1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 ... 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022
0 -0.11 -0.16 0.85 -0.76 -0.24 0.23 -0.37 -0.42 -0.54 0.81 ... 1.28 0.46 1.09 1.55 1.54 1.54 0.91 0.50 1.33 2.01
1 0.63 0.33 0.07 -0.17 -0.39 0.56 -0.07 0.08 -0.01 -0.11 ... 1.33 1.20 1.57 1.46 1.12 2.03 1.68 1.50 1.54 1.52
2 0.16 0.11 0.08 0.25 -0.10 0.43 -0.03 -0.07 0.29 0.12 ... 1.19 1.69 1.12 1.76 1.51 1.21 1.11 1.93 2.33 1.69
3 0.08 -0.04 0.17 -0.14 -0.56 0.18 -0.37 -0.19 0.13 -0.05 ... 1.26 1.17 1.01 1.54 1.44 1.19 1.54 1.43 1.27 1.26
4 0.74 0.11 -0.75 0.31 -0.49 0.41 0.64 0.02 -0.14 0.12 ... 0.83 1.95 1.69 1.99 1.93 1.92 1.96 2.56 1.53 3.24

5 rows × 62 columns

In [35]:
df_new.quantile(0.25).mean()
Out[35]:
0.30609677419354836
In [36]:
df_new.quantile(0.5).mean()
Out[36]:
0.49509436541016033
In [37]:
df_new.quantile(0.75).mean()
Out[37]:
0.7064193548387095
In [38]:
correlation_matrix = df_new.corr()
mask = np.triu(np.ones_like(correlation_matrix, dtype=bool))
plt.figure(figsize=(30, 30))
sns.heatmap(correlation_matrix, mask=mask, cmap='coolwarm', fmt=".0f", annot=True)
plt.show()
No description has been provided for this image

Interactive visualization¶

  • I create graphs with the Plotly library that show quite interesting data. We can see in the box plot how the values tend to approach the mean, reducing the outliers. And in the following scatter plots, we observe a notable trend towards increasing temperatures worldwide.
In [39]:
fig = px.box(columns_float_numeric, labels={"value": "Value"})
fig.update_layout(title='Box plot showing the trend of temperature change each year')
fig.show()
In [40]:
fig = px.scatter(df, x="1961", y="Country", color ="1961")
fig.update_layout(title='Scatter plot showing the variance of temperature in 1961 in each country')
fig.show()
In [41]:
fig = px.scatter(df, x="1996", y="Country", color ="1996")
fig.update_layout(title='Scatter plot showing the variance of temperature in 1996 in each country')
fig.show()
In [42]:
fig = px.scatter(df, x="2000", y="Country", color ="2000")
fig.update_layout(title='Scatter plot showing the variance of temperature in 2000 in each country')
fig.show()
In [43]:
fig = px.scatter(df, x="2022", y="Country", color ="2022")
fig.update_layout(title='Scatter plot showing the variance of temperature in 2022 in each country')
fig.show()